SMILES Extensions for Pattern Matching and Molecular Transformations: Applications in Chemoinformatics
Abstract
The selection and modification of atoms or functional groups underlie many of the manipulations central to molecular modeling. Automating these tasks has become even more important now that work with large databases of molecules is so prevalent. We have devised SUPER-SMILES, a conceptually simple set of extensions to the SMILES line notation, whose key features are addition and deletion facilities, macros, atom tagging, disjunctions, and constraints. This superset of SMILES enables us to carry out transformations on individual molecular structures or across members of a database with a pattern-matching protocol. The principal advantage of SUPER-SMILES is the ability to specify chemical reactions with a very simple augmentation of the SMILES line notation. For example, in conjunction with macros, it is possible to represent the displacement of tosylate with phenoxy by the expression “(Delete Tosyl) (Add Phenoxy)”. SUPER-SMILES thus represents a unified approach to molecular structure specification and modification and can easily be applied to large datasets of molecules. This functionality has been implemented within the PROMETHEUS suite of CAMD programs. We demonstrate its use in carrying out such operations as atom-type assignment, protonation of molecules, valency checking, and hydrogen addition. Further applications such as library design and construction immediately suggest themselves.
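The “(Delete Tosyl) (Add Phenoxy)” expression quoted in the abstract can be illustrated with a small toy sketch. This is not the SUPER-SMILES implementation or its actual syntax, which the abstract does not specify. The macro table, the function names `parse_operations` and `apply_operations`, and the text-level matching are all assumptions made for illustration. A real system would match patterns on the molecular graph; here the leaving group is simply assumed to be written at the end of the SMILES string.

```python
import re

# Hypothetical macro table (an assumption, not from the paper):
# each macro name maps to a plain SMILES fragment.
MACROS = {
    "Tosyl": "OS(=O)(=O)c1ccc(C)cc1",  # tosylate, written from the attached oxygen
    "Phenoxy": "Oc1ccccc1",
}

# Matches operations of the form "(Delete Name)" or "(Add Name)".
OP_PATTERN = re.compile(r"\(\s*(Add|Delete)\s+(\w+)\s*\)")


def parse_operations(expression):
    """Return [(operation, macro_name), ...] in the order written."""
    return OP_PATTERN.findall(expression)


def apply_operations(smiles, expression, macros=MACROS):
    """Apply Delete/Add operations to a substituent at the END of a SMILES string.

    This is a text-level toy: it only handles a group written as a
    trailing fragment and does no graph matching or valence checking.
    """
    for op, name in parse_operations(expression):
        fragment = macros[name]
        if op == "Delete":
            if not smiles.endswith(fragment):
                raise ValueError(f"{name} not found at end of {smiles!r}")
            smiles = smiles[: -len(fragment)]
        else:  # op == "Add"
            smiles += fragment
    return smiles


# Ethyl tosylate -> ethyl phenyl ether
product = apply_operations("CCOS(=O)(=O)c1ccc(C)cc1", "(Delete Tosyl) (Add Phenoxy)")
print(product)  # CCOc1ccccc1
```

Keeping the macro table separate from the operation parser mirrors the idea in the abstract that macros make reaction expressions short. Replacing the string suffix check with substructure matching from a cheminformatics toolkit would be the natural next step.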
Similar resources
Entropy of infinite systems and transformations
The Kolmogorov-Sinai entropy is a far reaching dynamical generalization of Shannon entropy of information systems. This entropy works perfectly for probability measure preserving (p.m.p.) transformations. However, it is not useful when there is no finite invariant measure. There are certain successful extensions of the notion of entropy to infinite measure spaces, or transformations with ...
Evaluation of Similarity Measures for Template Matching
Image matching is a critical process in various photogrammetry, computer vision and remote sensing applications such as image registration, 3D model reconstruction, change detection, image fusion, pattern recognition, autonomous navigation, and digital elevation model (DEM) generation and orientation. The primary goal of the image matching process is to establish the correspondence between two ...
A Fully Reversible Data Transform Technique Enhancing Data Compression of SMILES Data
The requirement to efficiently store and process SMILES data used in Chemoinformatics creates a demand for efficient techniques to compress this data. General-purpose transforms and compressors are available to transform and compress this type of data to a certain extent, however, these techniques are not specific to SMILES data. We develop a transform specific to SMILES data that can be used a...
Jmol SMILES and Jmol SMARTS: specifications and applications
Background: SMILES and SMARTS are two well-defined structure matching languages that have gained wide use in cheminformatics. Jmol is a widely used open-source molecular visualization and analysis tool written in Java and implemented in both Java and JavaScript. Over the past 10 years, from 2007 to 2016, work on Jmol has included the development of dialects of SMILES and SMARTS that incorporate ...
Treelet kernel incorporating cyclic, stereo and inter pattern information in chemoinformatics
Chemoinformatics is a research field concerned with the study of physical or biological molecular properties through computer science’s research fields such as machine learning and graph theory. From this point of view, graph kernels provide a nice framework which allows to naturally combine machine learning and graph theory techniques. Graph kernels based on bags of patterns have proven their ...
Journal title:
- Journal of Chemical Information and Computer Sciences
Volume 39, Issue
Pages -
Publication date 1999